 technical knowledge


Measuring and Augmenting Large Language Models for Solving Capture-the-Flag Challenges

arXiv.org Artificial Intelligence

Capture-the-Flag (CTF) competitions are crucial for cybersecurity education and training. As large language models (LLMs) evolve, there is increasing interest in their ability to automate CTF challenge solving. For example, DARPA has organized the AIxCC competition since 2023 to advance AI-powered automated offense and defense. However, this demands a combination of multiple abilities, from knowledge to reasoning and further to actions. In this paper, we highlight the importance of technical knowledge in solving CTF problems and deliberately construct a focused benchmark, CTFKnow, with 3,992 questions to measure LLMs' performance in this core aspect. Our study offers a focused and innovative measurement of LLMs' capability in understanding CTF knowledge and applying it to solve CTF challenges. Our key findings reveal that while LLMs possess substantial technical knowledge, they falter in accurately applying this knowledge to specific scenarios and adapting their strategies based on feedback from the CTF environment. Based on insights derived from this measurement study, we propose CTFAgent, a novel LLM-driven framework for advancing CTF problem-solving. CTFAgent introduces two new modules: two-stage Retrieval Augmented Generation (RAG) and interactive Environmental Augmentation, which enhance LLMs' technical knowledge and vulnerability exploitation on CTF, respectively. Our experimental results show that, on each of two popular CTF datasets, CTFAgent achieves a performance improvement of over 80%. Moreover, in the recent picoCTF2024 hosted by CMU, CTFAgent ranked in the top 23.6% of nearly 7,000 participating teams. This reflects the benefit of our measurement study and the potential of our framework in advancing LLMs' capabilities in CTF problem-solving.


DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking

arXiv.org Artificial Intelligence

Designing solutions for complex engineering challenges is crucial in human production activities. However, previous research in the retrieval-augmented generation (RAG) field has not sufficiently addressed tasks related to the design of complex engineering solutions. To fill this gap, we introduce a new benchmark, SolutionBench, to evaluate a system's ability to generate complete and feasible solutions for engineering problems with multiple complex constraints. To further advance the design of complex engineering solutions, we propose a novel system, SolutionRAG, that leverages the tree-based exploration and bi-point thinking mechanism to generate reliable solutions. Extensive experimental results demonstrate that SolutionRAG achieves state-of-the-art (SOTA) performance on the SolutionBench, highlighting its potential to enhance the automation and reliability of complex engineering solution design in real-world applications.


Perceptions of Discriminatory Decisions of Artificial Intelligence: Unpacking the Role of Individual Characteristics

arXiv.org Artificial Intelligence

This study investigates how personal differences (digital self-efficacy, technical knowledge, belief in equality, political ideology) and demographic factors (age, education, and income) are associated with perceptions of artificial intelligence (AI) outcomes exhibiting gender and racial bias and with general attitudes towards AI. Analyses of a large-scale experiment dataset (N = 1,206) indicate that digital self-efficacy and technical knowledge are positively associated with attitudes toward AI, while liberal ideologies are associated with lower outcome trust, higher negative emotion, and greater skepticism. Furthermore, age and income are closely connected to cognitive gaps in understanding discriminatory AI outcomes. These findings highlight the importance of promoting digital literacy skills and enhancing digital self-efficacy to maintain trust in AI and beliefs in AI usefulness and safety. The findings also suggest that the disparities in understanding problematic AI outcomes may be aligned with economic inequalities and generational gaps in society. Overall, this study sheds light on the socio-technological system in which complex interactions occur between social hierarchies, divisions, and machines that reflect and exacerbate the disparities.


GitHub - divamgupta/diffusionbee-stable-diffusion-ui: Diffusion Bee is the easiest way to run Stable Diffusion locally on your M1 Mac. Comes with a one-click installer. No dependencies or technical knowledge needed.

#artificialintelligence

Diffusion Bee is the easiest way to run Stable Diffusion locally on your M1 Mac. Comes with a one-click installer. No dependencies or technical knowledge needed.


GPT-3: The biggest breakthrough in AI in recent history

#artificialintelligence

GPT-1 was released on June 11, 2018. When OpenAI released this model, there was much excitement. It combined the transformer architecture with unsupervised pre-training and showed promising results. The key difference between GPT-1 and the language models before it is that it was fine-tuned, or trained for specific tasks. GPT-2 was introduced in February 2019.


Two Cheers for the Pentagon's New Data and AI Initiative

#artificialintelligence

The Department of Defense is considering organizational changes designed to create a more integrated approach to data and artificial intelligence, including the creation of a Chief Data and Artificial Intelligence Officer. If the reorganization occurs, the CDAO will oversee several pre-existing offices, including the office of the Chief Data Officer, the Joint Artificial Intelligence Center, and the Defense Digital Service. Consolidated oversight through creating an empowered CDAO could help ensure DoD has the tools it needs to excel and ensure U.S. defense innovation leadership moving forward. Technology leadership requires data and AI leadership, and right now DoD's data and AI efforts are splintered. For example, according to Govini, a decision science company based in Virginia, at least 15 separate institutions within DoD invest to some extent in artificial intelligence, AI adjacent technologies, foundational enabling capabilities for AI, or programs that use AI during development.


Council Post: Understanding The Value Of Artificial Intelligence Solutions In Your Business

#artificialintelligence

Kamales Lardi is CEO of Lardi & Partner Consulting and a global award-winning expert in digital business transformation. In recent years, artificial intelligence (AI) has entered the mainstream business landscape as a critical path to advance in the digital economy. However, there is still a strong misconception in the business world as to what AI is and how it contributes to the digital transformation of an organization. As I work with leadership teams to develop a company's digital transformation strategy, it is not uncommon to find strategic goals such as, "implement AI-based solutions to realize xx% revenue gains," or "implement AI to increase productivity and efficiency." These vague, high-level strategic goals betray a lack of understanding of the capabilities of AI-based solutions and how they could really add value to businesses.


How Vapes Can Be Used For Hacking Computers

#artificialintelligence

The popularity of vaping has been increasing in recent years. Many people who used to smoke traditional cigarettes are now switching to vaping. Some people have even argued that vaping can help a person stop smoking. Some proponents of vaping argue that the practice is less dangerous than smoking conventional tobacco. However, others have argued that the aerosols electronic cigarettes create can be a health issue.


Humans taught a robot how to be a teaching assistant in just 3 hours

#artificialintelligence

Striking the right balance between robot autonomy and human control is a core challenge in social robotics, in both technical and ethical terms. On the one hand, extended robot autonomy offers the potential for increased human productivity and for the off-loading of physical and cognitive tasks. On the other hand, making the most of human technical and social expertise, as well as maintaining accountability, is highly desirable. This is particularly relevant in domains such as medical therapy and education, where social robots hold substantial promise, but where there is a high cost to poorly performing autonomous systems, compounded by ethical concerns. We present a field study in which we evaluate SPARC (supervised progressively autonomous robot competencies), an innovative approach addressing this challenge whereby a robot progressively learns appropriate autonomous behavior from in situ human demonstrations and guidance. Using online machine learning techniques, we demonstrate that the robot could effectively acquire legible and congruent social policies in a high-dimensional child-tutoring situation needing only a limited number of demonstrations while preserving human supervision whenever desirable. By exploiting human expertise, our technique enables rapid learning of autonomous social and domain-specific policies in complex and nondeterministic environments. Last, we underline the generic properties of SPARC and discuss how this paradigm is relevant to a broad range of difficult human-robot interaction scenarios.


Digishock 2.0: Machine Learning for Beginners (No Coding) - CouponED

#artificialintelligence

Learn the basics of machine learning without using code. Learn to teach a machine with a camera. Use an AI platform to build AI models and train on datasets. Know about IBM Watson and Wipro Holmes AI technologies. Convert a web application or software to an app in less than a minute. The Digishock 1.0 course from Udemy is a must in order to understand the tools better. No other experience or technical knowledge is necessary. This course takes a huge leap from Digishock 1.0 and is for anyone who wants an introduction to machine learning and deep learning without learning to code. This practical course involves hands-on exercises with numerous tricks and techniques from analytics and advanced predictive concepts, so that all participants become familiar with the disciplines of machine learning, deep learning, big data, analytics, etc. The USP of the course is that no technical knowledge whatsoever is required of students who participate.